14

The Nature of Living Things

to explore (via mutations) neighbouring (in sequence space) genomes. Hence, bioin-

formatics (applied to genomics) needs a higher level theory than that provided by

existing information theory. An important, although long-term, task of bioinformat-

ics is to determine how biological genomes are chosen such that they are suited to

their tasks, encompassing such aspects.

Unreliable DNA polymerase is a distinct advantage for producing new antibodies

(somatic hypermutation) and for viruses needing to mutate rapidly in order to evade

host defences—provided it is not too unreliable: Eigen (1976) has shown that in a

soup of self-replicating molecules, there is a replication error rate threshold above

which an initially diverse population of molecules cannot converge onto a stable,

optimally replicating one (a quasi-species 45).
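As a hedged illustration that is not part of the text above, the threshold is commonly approximated as follows. The sketch assumes a single master sequence of length $L$, a uniform per-nucleotide copying fidelity $q$ (so the error rate per nucleotide is $1-q$), and a selective superiority $\sigma > 1$ of the master sequence over the average of its mutants; all three symbols are assumptions introduced here.

```latex
% The master sequence persists only if error-free copies outgrow the mutants:
\[
  q^{L}\,\sigma > 1
  \quad\Longrightarrow\quad
  L < L_{\max} = \frac{\ln \sigma}{-\ln q} \approx \frac{\ln \sigma}{1 - q}
  \qquad (1 - q \ll 1).
\]
```

Under these assumptions, the tolerable genome length is roughly inversely proportional to the per-nucleotide error rate, and depends only logarithmically on the selective superiority.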

Problem. What are the implications of a transcription error rate estimated as 1 in $10^{5}$?

(In contrast, the error rate of DNA replication is estimated as 1 in $10^{10}$.) Calculate

the proportion of proteins containing the wrong amino acids due to mistakes in tran-

scription, assuming that translation is perfect. Compare the result with a translation

error rate estimated as 1 in 3000.
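One possible way to set up the arithmetic for this problem is sketched below; it is not a definitive solution. It assumes that errors occur independently at each position, that the two quoted rates apply per nucleotide (transcription) and per amino acid (translation), and that a protein is 300 amino acids long. The protein length and the helper name `fraction_with_error` are hypothetical choices made for illustration. Treating every miscopied nucleotide as an amino acid change gives an upper bound, since some substitutions are synonymous.

```python
def fraction_with_error(per_unit_error_rate: float, units: int) -> float:
    """Probability that at least one of `units` independent copying steps is wrong."""
    return 1.0 - (1.0 - per_unit_error_rate) ** units


# Assumption: a hypothetical protein length, not given in the text.
PROTEIN_LENGTH_AA = 300

# Rates quoted in the problem statement.
TRANSCRIPTION_ERROR_PER_NT = 1e-5
TRANSLATION_ERROR_PER_AA = 1 / 3000

# Upper bound: every miscopied nucleotide is counted as an amino acid change
# (three nucleotides per codon, synonymous substitutions ignored).
transcription_fraction = fraction_with_error(
    TRANSCRIPTION_ERROR_PER_NT, 3 * PROTEIN_LENGTH_AA
)

# For comparison: the same protein with perfect transcription
# but translation errors at the quoted rate.
translation_fraction = fraction_with_error(
    TRANSLATION_ERROR_PER_AA, PROTEIN_LENGTH_AA
)

print(f"faulty proteins from transcription errors: {transcription_fraction:.4%}")
print(f"faulty proteins from translation errors:   {translation_fraction:.4%}")
print(f"ratio translation/transcription: {translation_fraction / transcription_fraction:.1f}")
```

Under these assumptions, translation errors would affect roughly ten times as many proteins as transcription errors; a different assumed protein length changes both fractions but leaves the ratio nearly the same while the fractions remain small.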

Problem. Explore the suggestion that the quality of a channel (such as a telephone

line) is independent of the actual message.

14.7.3 Recombination

Homologous recombination is a key process in genetics, whereby the rearrangement of genes can take place. It involves the exchange of genetic material between

two sets of parental DNA during meiosis (Sect. 14.4.1). The mechanism of recogni-

tion and alignment of homologous (i.e., with identical, or almost identical, nucleotide

sequences) sections of duplex (double-stranded) DNA is far less clear than the recog-

nition between complementary single strands; it may depend on the pattern of elec-

trostatically charged (ionized) phosphates, which itself depends slightly but probably

sufficiently on sequence, and can be further modulated by (poly)cations adsorbed on

the surface of the duplex. 46

Following the alignment, the breakage of the DNA takes place, and the broken

ends are then shuffled to produce new combinations of genes; for example, consider

a hypothetical replicated pair of chromosomes, with the dominant gene written in

45 A quasi-species may be defined as a cluster of genomes in sequence space, the diameter of the

cluster being sufficiently small such that almost every sequence can “mate” with every other one

and produce viable offspring. The sequence at the centre of the cluster is called the master sequence.

If the error rate is above the threshold, in principle all possible sequences will be found. See also

Sect. 4.1.2.

46 Kornyshev and Leikin (2001).